(METHOD FOR RETURN INSTRUCTION IDENTIFICATION AND ASSOCIATED
METHOD FOR RETURN TARGET POINTER PREDICTION)



DESCRIPTION 



Background of Invention 
(Para 1) Field of the Invention

(Para 2) This invention relates to a method for predicting branch instructions,
and more particularly to a method for predicting the target pointer of a return
instruction in a microprocessor and a digital signal processor.

(Para 3) Description of Related Art

(Para 4) Present-day microprocessors and digital signal processors both
utilize a multi-stage pipeline for processing instructions. A pipeline
comprises stages such as fetch, decode, and execute. To improve processing
efficiency, multiple stages of the pipeline usually operate simultaneously.
For example, while the third stage processes the first instruction, the
second stage processes the second instruction and the first stage processes
the third instruction. Without this overlap, the second instruction would not
be processed until the first instruction had left the pipeline, leaving most
stages idle and wasting resources.
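The benefit of overlapping stages can be put in numbers with a small sketch.
It assumes an idealized pipeline in which every stage takes one cycle and no
stalls or flushes occur; the function name and parameters are hypothetical
and are used only for illustration.

```python
def cycles(n_instructions, n_stages, pipelined):
    """Cycle count under the idealized one-cycle-per-stage assumption."""
    if pipelined:
        # The first instruction needs n_stages cycles; each later
        # instruction completes one cycle after its predecessor.
        return n_stages + n_instructions - 1
    # Without overlap, each instruction occupies the whole pipeline alone.
    return n_stages * n_instructions


overlapped = cycles(3, 3, pipelined=True)
sequential = cycles(3, 3, pipelined=False)
print(overlapped, sequential)  # 5 9
```

For the three-instruction, three-stage case described above, the overlapped
pipeline finishes in 5 cycles instead of 9 under these assumptions.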

(Para 5) Such a pipeline works smoothly when instructions are executed
sequentially, but a problem occurs when a branch instruction is encountered.
When a branch is taken, the program counter jumps, so the results obtained in
the earlier stages of the pipeline are flushed in order to process the
instruction at the target pointer of the branch instruction. That is, the
processing time spent in those earlier stages is wasted.

(Para 6) A technology for predicting the target pointer of a branch
instruction, known as "branch prediction", has been developed. The purpose of
branch prediction is to predict the target pointer during the fetch or decode
stage of the pipeline, so that the pipeline is able to process the subsequent
instructions. In a later stage, if the target pointer turns out to have been
predicted correctly, the results obtained from the previous stages are not
wasted, and the efficiency of each stage is retained.

(Para 7) Branch instructions are divided into categories such as direct,
indirect, relative, absolute, conditional, and unconditional. Call
instructions for calling subroutines and the corresponding return
instructions also belong to branch instructions.

(Para 8) Branch prediction takes a variety of forms. A traditional branch
target buffer is not able to handle a more complicated subroutine call.
Referring to FIG. 1, an infinite loop of a program section is demonstrated.
Provided the length of the call instruction is four bytes, the program section
is executed from address 1100. The code at 1100 calls the subroutine "print"
located at 1500, and on reaching "return" at address 1600, the branch target
buffer adds a record that maps the return instruction address 1600 to the
return target pointer 1104. Then the code at address 1200 calls the subroutine
"print" located at 1500, and on reaching the return instruction at 1600, the
branch target buffer predicts the next address to be 1104, yet the actual
return target pointer is 1204, which is a prediction error. The branch target
buffer then updates the return target pointer corresponding to the return
instruction address 1600 to 1204, and the next time address 1100 calls "print"
and returns, the buffer predicts the return target pointer to be 1204, which
is again erroneous. When multiple addresses share a common subroutine, a
traditional branch target buffer therefore mispredicts continuously. U.S.
patent No. 6601161 provides a method comprising the foregoing branch target
buffer, which makes its prediction in the fetch stage of a pipeline but does
not work accurately under a slightly complicated process.
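The repeated misprediction described for the FIG. 1 loop can be reproduced
with a short simulation. This is only a sketch: it models the traditional
branch target buffer as a table that remembers the last observed target for
each branch address, which is an assumption about its behavior, and the names
`run_btb` and `CALL_LEN` are hypothetical.

```python
CALL_LEN = 4  # call instruction length in bytes, as stated in the text


def run_btb(call_sites, return_addr, rounds):
    """Replay the loop; count wrong predictions for the shared return."""
    btb = {}  # return instruction address -> last observed return target
    missed = 0
    total = 0
    for _ in range(rounds):
        for site in call_sites:
            actual = site + CALL_LEN          # true return target pointer
            predicted = btb.get(return_addr)  # None on the first encounter
            if predicted != actual:
                missed += 1
            btb[return_addr] = actual         # update once the return resolves
            total += 1
    return missed, total


missed, total = run_btb(call_sites=[1100, 1200], return_addr=1600, rounds=3)
print(missed, total)  # 6 6
```

Under these assumptions every return is mispredicted, because the two call
sites alternate and the buffer always holds the target of the other one.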

(Para 9) U.S. patent No. 6425076 further provides another method, which
provides a plurality of prediction methods, determines the reliability and
priority of each, and screens one prediction result from the group. The
drawback is that the method is not able to predict until the decode stage of
the pipeline, which is late, especially when the pipeline includes multiple
fetch stages.

(Para 10) U.S. patent No. 6609194 further provides a method comprising
several methods for predicting various types of branch instruction, one of
which is a call/return stack for predicting the target pointer of the return
instruction. This method also has the drawback of delayed prediction, which is
performed at the decode stage of the pipeline.

(Para 11) According to the above descriptions and examples, a method and
structure for precisely predicting return instructions at the fetch stage of a
pipeline is desired.

Summary of Invention

(Para 12) The present invention is directed to a method for predicting a
target pointer of a return instruction, which provides a correct prediction
result at the fetch stage of a pipeline and is able to handle complicated
program steps.

(Para 13) The present invention is directed to a method for identifying a
return instruction, comprising providing a return target stack initially and
fetching a current instruction; if the current instruction is a call
instruction, adding the length of the current instruction to the address of
the current instruction to obtain a target pointer, which is stored to the
return target stack. Lastly, if the address of the subsequent instruction is
identical to a target pointer stored in the return target stack, then the
current instruction is a return instruction.

(Para 14) The present invention is directed to a method for predicting a
target pointer, comprising providing a return target stack and a return
instruction address table initially, and providing a current instruction; if
the current instruction is a call instruction, adding the length of the
current instruction to the address of the current instruction to obtain a
target pointer, which is stored to the return target stack. Then, if the
address of a subsequent instruction to be fetched after the current
instruction is executed is identical to a target pointer stored in the return
target stack, the current instruction is identified as a return instruction.
The address of the return instruction is stored in the return instruction
address table, and the target pointer identical to the address of the
subsequent instruction is deleted from the return target stack. Lastly, if the
address of the current instruction is stored in the return instruction address
table, the address on the top layer of the return target stack is assigned as
the address of the next instruction.

(Para 15) According to one embodiment of the present invention, since only the
content address of the program counter needs to be supplied to the return
target buffer for predicting the target pointer of the return instruction, the
prediction result is provided in the fetch stage of a pipeline. In addition,
since a correct return address is pre-stored in the return target stack for
prediction purposes each time a call instruction is executed in the
embodiment, the prediction is still performed correctly when a complicated
program is executed, e.g. when multiple program sections share a common
subroutine.

Brief Description of Drawings

(Para 16) FIG. 1 is an exemplary infinite loop program section.

(Para 17) FIG. 2 is a block diagram illustrating the structure according to
one embodiment of the present invention.

(Para 18) FIG. 3 is a schematic flowchart illustrating method steps for
judging a return instruction.

(Para 19) FIG. 4 is a schematic flowchart illustrating method steps for
predicting a target pointer according to one embodiment of the present
invention.

(Para 20) FIG. 5 is a schematic diagram depicting a return instruction address
table according to one embodiment of the present invention.

(Para 21) FIG. 6 is a schematic diagram depicting a return target stack
according to one embodiment of the present invention.






(Para 22) FIGS. 7A to 7F are schematic diagrams illustrating steps of
executing the program section in FIG. 1 according to one embodiment of the
present invention.

Detailed Description

(Para 23) Referring to FIG. 2, it illustrates a structure for predicting a
target pointer of a return instruction according to one embodiment of the
present invention. The structure comprises a return target buffer 220 serving
to predict the target pointer of the return instruction, wherein the return
target buffer 220 comprises a return instruction address table 230 and a
return target stack 240. The instruction obtained at the fetch stage is stored
to the instruction buffer 270 and is identified at the decode stage by the
decoder 280, from which the return target pointer is extracted.

(Para 24) Referring to FIG. 3, the method and apparatus for identifying a
return instruction, i.e. for building a return instruction address table, is
described herein. The method steps according to one embodiment of the present
invention comprise starting at START, fetching the current instruction in step
302, and identifying whether the current instruction is a call instruction. If
it is, the method proceeds to step 306; otherwise it proceeds to step 308. If
it is a call instruction, the length of the current instruction is added to
the address of the current instruction to obtain a return target pointer,
which is stored in the return target stack in step 306. Then, in step 308,
after the current instruction is executed, the address of the subsequent
instruction is checked to see whether it is stored in the return target stack.
If it is, the method proceeds to step 310, where the current instruction is
identified as a return instruction, and the address of the current return
instruction is stored to the return instruction address table in step 312. In
step 314, the target pointer identical to the address of the subsequent
instruction is deleted from the return target stack, and the method proceeds
to END in step 316. If the check in step 308 is negative, the method proceeds
directly to END in step 316.
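The FIG. 3 flow can be sketched in software as follows. This is an
illustrative model only, not the hardware of the embodiment: the function
name `identify_return`, the `state` dictionary, and the choice to delete the
match nearest the top of the stack when duplicates exist are all assumptions.

```python
def identify_return(state, addr, length, is_call, next_addr):
    """Process one executed instruction; return True if it is a return.

    state["stack"] is the return target stack (top at the end of the list)
    and state["table"] is the return instruction address table.
    """
    stack, table = state["stack"], state["table"]
    if is_call:
        # Step 306: store address + length as the return target pointer.
        stack.append(addr + length)
    if next_addr in stack:
        # Steps 308-312: the next address matches a stored target pointer,
        # so the current instruction is recorded as a return instruction.
        table.add(addr)
        # Step 314: delete the matching pointer (nearest the top, assumed).
        top_match = len(stack) - 1 - stack[::-1].index(next_addr)
        del stack[top_match]
        return True
    return False


state = {"stack": [], "table": set()}
print(identify_return(state, 1100, 4, True, 1500))   # False
print(identify_return(state, 1600, 1, False, 1104))  # True
```

In the usage lines, the call at 1100 jumps to 1500 and stores 1104; the
instruction at 1600 is then followed by address 1104, so it is identified as
a return. The one-byte length given for the return is a placeholder.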

(Para 25) Referring to FIG. 4, it illustrates a flowchart of a method and an
apparatus thereof for predicting the return instruction target pointer, i.e.
how to predict by using the return instruction address table. FIG. 4 is an
extension of FIG. 3. According to an embodiment of the present invention, the
method and apparatus for predicting the return instruction target pointer is
identical to FIG. 3 before step 402; that is, the steps for identifying a
return instruction are the same. Step 316 is connected to steps 402 and 404
in FIG. 4: after step 316 is performed, the address of the current
instruction is checked in step 402 to see whether it is stored in the return
instruction address table. If yes, the address on the topmost layer of the
return target stack is assigned as the address of the next instruction in
step 404, i.e. the address on the topmost layer of the return target stack is
predicted as the target pointer of the return instruction. The extracted
return instruction target pointer is passed to the program counter 250 as the
address from which to fetch the next instruction.
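Steps 402 and 404 amount to a lookup keyed only by the program counter. The
sketch below is a hedged illustration: `predict_next` is a hypothetical name,
the table is modeled as a set and the stack as a list with its top at the
end, and the function only reads the top entry, leaving deletion to the
identification steps.

```python
def predict_next(pc, return_table, return_stack):
    """Return the predicted next fetch address, or None if no prediction."""
    # Step 402: is the current address a known return instruction?
    if pc in return_table and return_stack:
        # Step 404: the topmost stack entry is the predicted target pointer.
        return return_stack[-1]
    return None


print(predict_next(1600, {1600}, [1204]))  # 1204
print(predict_next(1100, {1600}, [1204]))  # None
```

Because the function needs nothing but the program counter value, it can be
evaluated before the instruction has been decoded.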

(Para 26) According to the above descriptions, the step of predicting the
target pointer merely requires providing the content address of the program
counter 250; that is, the prediction can be done at the fetch stage of the
pipeline, which reduces idle stages and improves the performance of the
microprocessor and digital signal processor.

(Para 27) Referring to FIG. 5, it illustrates a detailed schematic diagram of
the return instruction address table 230. According to one embodiment of the
present invention, the return instruction address table 230 comprises merely
four rows, yet an arbitrary number of rows is within the scope of the present
invention. Each row comprises an effective flag 510 and an address column
520. All effective flags 510 are cleared at initialization, indicating that
no addresses are included in the return instruction address table 230. If an
address is to be added, it is stored in the address column 520 of one of the
rows, and the effective flag 510 of that row is set. The capacity of the
return instruction address table 230 is limited, so if the table is fully
loaded and a new address is to be added, one of the old addresses stored
therein needs to be replaced. For those skilled in the art, it is simple to
store a new address to the return instruction address table 230 in place of
an old address, e.g. by a circular replacing method that replaces the oldest
address in the return instruction address table 230 with the new one.

(Para 28) FIG. 5 also shows how to check whether the return instruction
address table includes the address of the current instruction. In FIG. 5,
each of the rows corresponds to a comparator 530, and the comparators
simultaneously compare the address stored in each row with the content
address 550 from the program counter. The comparison outcome of each row is
then passed to an OR gate 540 for an overall OR operation, and the output of
the OR gate 540 is the checking result 560.
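A software model of the FIG. 5 table might look like the sketch below. The
class and attribute names are hypothetical, the circular pointer is one
possible realization of the circular replacing method mentioned above, and
skipping an address that is already present is an added assumption.

```python
class ReturnAddressTable:
    def __init__(self, rows=4):
        self.valid = [False] * rows  # effective flags 510, cleared at start
        self.addr = [0] * rows       # address column 520
        self.next_row = 0            # circular replacement pointer (assumed)

    def contains(self, pc):
        # One comparator 530 per row; a row only hits if its flag is set.
        hits = [v and a == pc for v, a in zip(self.valid, self.addr)]
        return any(hits)             # OR gate 540 -> checking result 560

    def add(self, address):
        if self.contains(address):
            return                   # already recorded (assumption)
        self.valid[self.next_row] = True
        self.addr[self.next_row] = address
        # Advance circularly so the oldest row is overwritten next.
        self.next_row = (self.next_row + 1) % len(self.addr)


table = ReturnAddressTable()
for a in (10, 20, 30, 40, 50):       # the fifth entry replaces the oldest
    table.add(a)
print(table.contains(10), table.contains(50))  # False True
```

The effective flags matter in the lookup: without them, a freshly initialized
table would report a hit for whatever value the address column happens to
hold.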

(Para 29) Referring to FIG. 6, it illustrates a detailed schematic diagram of
the return target stack 240. The return target stack 240 comprises merely
four rows according to one embodiment of the present invention, yet an
arbitrary number of rows is within the scope of the present invention. Each
of the rows comprises an effective flag 610 and an address column 620. All of
the effective flags 610 are cleared at initialization, indicating that the
return target stack 240 contains no addresses. According to the embodiment of
the present invention, each time a new address is added, every row of the
stack 240 except the bottom row is shifted down, including the effective flag
610 and the address column 620 of each row. The bottom row is overwritten by
the content of the second-last row. The new address is then written to the
address column 620 of the topmost row, and its corresponding flag 610 is set.

(Para 30) According to one embodiment of the present invention, the address
extracted from the stack 240 is always taken from the address column 620 of
the topmost row, which is the opposite operation to adding a new address. The
second row through the bottom row of the stack 240 are shifted up
simultaneously, including the effective flag 610 and the address column 620
of each row. The content of the first row is overwritten by the second row,
and the effective flag 610 of the bottom row is cleared.

(Para 31) According to another embodiment of the present invention, the
return target stack 240 is a circular queue, wherein when the queue is full
the oldest stored address is replaced with the latest address.
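The circular-queue behavior can be illustrated with Python's bounded
`collections.deque`, which discards the oldest item when a new one is
appended to a full queue. This is an analogy for the embodiment rather than
its implementation, and the addresses are made up for the example.

```python
from collections import deque

# A four-entry return target stack modeled as a bounded deque.
stack = deque(maxlen=4)
for target in (1104, 1204, 1304, 1404, 1504):
    stack.append(target)     # the fifth append drops the oldest entry, 1104

print(list(stack))           # [1204, 1304, 1404, 1504]
top = stack.pop()            # the most recent target is extracted first
print(top)                   # 1504
```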

(Para 32) Referring to FIG. 6, it also illustrates how to check whether the
return target stack 240 contains the target pointer of a return instruction.
In FIG. 6, each row corresponds to a comparator 630, and the comparators
simultaneously compare the content of the address column 620 of each row with
the target pointer 650 of the current return instruction. The comparison
results from the rows are then passed to an OR gate 640 for an overall OR
operation, and the output of the OR gate 640 is the final checking result 660.
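The shifting stack of FIG. 6, together with its comparator check, can be
modeled as below. This is a sketch under stated assumptions: the class and
method names are hypothetical, index 0 stands for the topmost row, and
returning `None` from an empty stack is a choice made for the example.

```python
class ReturnTargetStack:
    def __init__(self, rows=4):
        self.valid = [False] * rows  # effective flags 610, cleared at start
        self.addr = [0] * rows       # address column 620, index 0 = top row

    def push(self, address):
        # Every row shifts down one place; the old bottom row is lost.
        self.valid = [True] + self.valid[:-1]
        self.addr = [address] + self.addr[:-1]

    def pop(self):
        if not self.valid[0]:
            return None              # stack holds no address (assumption)
        top = self.addr[0]
        # Rows below the top shift up; the bottom flag is cleared.
        self.valid = self.valid[1:] + [False]
        self.addr = self.addr[1:] + [0]
        return top

    def contains(self, target):
        # Comparators 630 feed OR gate 640 to give checking result 660.
        return any(v and a == target for v, a in zip(self.valid, self.addr))


s = ReturnTargetStack()
s.push(1104)
s.push(1204)
print(s.contains(1104), s.pop(), s.pop(), s.pop())  # True 1204 1104 None
```

In hardware all rows shift in the same cycle; the list slicing here simply
mirrors that effect one operation at a time.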

(Para 33) The method of the present invention differs from a conventional
branch target buffer in that it predicts precisely under the complicated
condition shown in FIG. 1. Provided the length of the call instruction is
four bytes, the program is executed from address 1100. First, when execution
reaches address 1100, the first call instruction adds 4 to the content
address of the current program counter, that is, 1104 is stored to the return
target stack as depicted in FIG. 7A. On executing the return instruction at
address 1600, the address 1600 is added to the return instruction address
table, and the target pointer 1104 is deleted from the return target stack as
depicted in FIG. 7B. On executing the call instruction at address 1200, the
address 1204 is added to the return target stack as depicted in FIG. 7C. When
the return instruction at address 1600 is executed for the second time, the
address 1600 is already listed in the return instruction address table, so
the topmost address 1204 of the return target stack is directly fetched as
the prediction result, which is a correct hit as depicted in FIG. 7D. On
executing the call instruction at address 1100 for the second time, the
address 1104 is added to the return target stack as depicted in FIG. 7E.
Lastly, when the return instruction at address 1600 is executed for the third
time, since the address 1600 is already listed in the return instruction
address table, the address 1104 on the topmost row of the return target stack
is directly fetched as the prediction result, which is again a correct hit as
depicted in FIG. 7F.
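The sequence of FIGS. 7A to 7F can be replayed with a compact simulation. It
is a sketch only: the function `trace`, the event tuples, and the use of a
Python set and list for the table and stack are illustrative assumptions, and
the prediction merely reads the top of the stack while the identification
step performs the deletion.

```python
CALL_LEN = 4  # call instruction length in bytes, as stated in the text


def trace(events):
    """events: (address, kind, actual_next), with kind 'call' or 'ret'."""
    table, stack, log = set(), [], []
    for addr, kind, actual_next in events:
        # Prediction at fetch: only the program counter value is consulted.
        predicted = stack[-1] if addr in table and stack else None
        if kind == "call":
            stack.append(addr + CALL_LEN)
        # Identification after execution: a matching next address marks
        # the instruction as a return and consumes the stored pointer.
        if actual_next in stack:
            table.add(addr)
            stack.remove(actual_next)
        log.append((addr, predicted, actual_next))
    return log


events = [
    (1100, "call", 1500), (1600, "ret", 1104),  # FIG. 7A, FIG. 7B
    (1200, "call", 1500), (1600, "ret", 1204),  # FIG. 7C, FIG. 7D
    (1100, "call", 1500), (1600, "ret", 1104),  # FIG. 7E, FIG. 7F
]
for addr, predicted, actual in trace(events):
    print(addr, predicted, actual)
```

The first return carries no prediction because address 1600 is not yet in the
table; the second and third returns are predicted as 1204 and 1104, matching
the actual targets.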

(Para 34) According to the above descriptions and embodiments, the method and
structure thereof provided in the present invention are able to precisely
predict the target pointer of the return instruction at the first fetch stage
of a pipeline.

(Para 35) The above description provides a full and complete description of
the preferred embodiments of the present invention. Various modifications,
alternate constructions, and equivalents may be made by those skilled in the
art without changing the scope or spirit of the invention. Accordingly, the
above description and illustrations should not be construed as limiting the
scope of the invention, which is defined by the following claims.